Document Text Extraction from Document Images Using Haar Discrete Wavelet Transform

Authors

  • S. Audithan
  • RM. Chandrasekaran
Abstract

This paper presents an efficient and computationally fast method for extracting text regions from document images. We use the Haar discrete wavelet transform (DWT) [9], which is the fastest wavelet transform to compute because its filter coefficients are either 1 or -1; this is one of the reasons we use it to detect the edges of candidate text regions. First, edges are detected; then a line feature vector graph is generated from the edge map, and stroke information is extracted. Finally, text regions are generated and filtered according to their line features. Experimental results show that the proposed method notably suppresses false alarms without increasing the computational cost. Furthermore, the method can be easily customized for applications with different tradeoffs between recall and precision.
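
The edge-detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 1/4 normalization, the function names `haar_dwt2` and `candidate_edge_map`, the `threshold` parameter, and the choice to combine the three detail sub-bands by summing absolute values are all assumptions made for illustration. The line feature vector graph and the stroke-based filtering are not reproduced here.

```python
import numpy as np


def haar_dwt2(image):
    """One level of a 2-D Haar DWT on a grayscale image (assumed scaling: 1/4).

    Returns (approx, horiz, vert, diag), each half the size of the input.
    Only additions and subtractions of pixels are needed, because the Haar
    filter coefficients are +1 and -1 up to a constant scale factor.
    """
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # crop to even dimensions

    # The four pixels of every non-overlapping 2x2 block.
    a = img[0::2, 0::2]  # top-left
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right

    approx = (a + b + c + d) / 4.0  # local average
    horiz = (a + b - c - d) / 4.0   # top row minus bottom row: horizontal edges
    vert = (a - b + c - d) / 4.0    # left column minus right column: vertical edges
    diag = (a - b - c + d) / 4.0    # diagonal detail
    return approx, horiz, vert, diag


def candidate_edge_map(image, threshold):
    """Boolean map of blocks whose combined detail magnitude exceeds threshold.

    The threshold value and the sum-of-absolute-values rule are hypothetical.
    """
    _, horiz, vert, diag = haar_dwt2(image)
    magnitude = np.abs(horiz) + np.abs(vert) + np.abs(diag)
    return magnitude > threshold


if __name__ == "__main__":
    # Toy image: a dark first column followed by bright columns.
    toy = np.zeros((4, 4))
    toy[:, 1:] = 1.0
    print(candidate_edge_map(toy, threshold=0.25))
```

In this sketch, an intensity change that falls exactly on a 2x2 block boundary produces no detail response at this level. Whether and how the paper addresses that case is not stated in the abstract.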

Similar resources

Modified Method of Document Text Extraction from Document Images Using Haar DWT

This paper extends the technique used for document text extraction from images using the 2-D Haar wavelet. The discrete wavelet transform is a very useful tool for signal analysis and image processing, especially in multi-resolution representation. It can decompose a signal into different components in the frequency domain. The two-dimensional discrete wavelet transform (2-D DWT) decomposes an input imag...

Text Extraction of Vehicle Number Plate and Document Images Using Discrete Wavelet Transform in MATLAB

Text extraction from colour images is a challenging task in computer vision. The concept of text extraction is derived from vehicle plate recognition and the extraction of individual characters. Example applications include automatic image indexing, assistance for visually impaired people, optical character reading, and keyword searching in document images. The continuous research has...

Image Segmentation for Text Extraction

This paper presents a methodology for extracting text from images such as document images and scene images. Text that appears in these images contains important and useful information. Text extraction from images has been used in a large variety of applications such as mobile robot navigation, document retrieval, object identification, and vehicle license plate detection. In this paper, we emplo...

A Novel Method for Efficient Text Extraction from Real Time Images with Diversified Background using Haar Discrete Wavelet Transform and K-Means Clustering

The proposed system highlights a novel approach to extracting text from an image using the two-dimensional Haar discrete wavelet transform and K-means clustering. As the commercial usage of digital content is on the rise, efficient and error-free text indexing, along with text localization and extraction, is of high importance. The majority of the previous research work on text ex...

Global Approach for Script Identification using Wavelet Packet Based Features

In a multi-script environment, archives commonly contain documents whose text regions are printed in different scripts. For automatic processing of such documents through Optical Character Recognition (OCR), it is necessary to identify the different script regions of the document. In this paper, a novel texture-based approach is presented to identify the script type of the collection of documen...

Publication date: 2009